(PNC) into multiple-input multiple-output (MIMO) two-way relay channels (TWRCs). At the heart of the proposed
scheme lies a new key technique referred to as eigen-direction alignment (EDA) precoding. The EDA precoding
efficiently aligns the two users' eigen-modes into the same directions. Based on that, we carry out multi-stream PNC
over the aligned eigen-modes. We derive an achievable rate of the proposed EDA-PNC scheme, based on nested
lattice codes, over a MIMO TWRC. Asymptotic analysis shows that the proposed EDA-PNC scheme approaches
the capacity upper bound as the number of user antennas increases towards infinity. For a finite number of user
antennas, we formulate the design criterion of the optimal EDA precoder and present solutions. Numerical results
show that there is only a marginal gap between the achievable rate of the proposed EDA-PNC scheme and the
capacity upper bound of the MIMO TWRC, in the medium-to-large SNR region. We also show that the proposed
EDA-PNC scheme significantly outperforms existing amplify-and-forward and decode-and-forward based schemes
for MIMO TWRCs.

I. Introduction

A two-way relay channel (TWRC), where two users exchange information simultaneously via an 
intermediate relay, can potentially double the throughput of a conventional one-way relay channel [1]. 
Recently, it has been shown that physical-layer network coding (PNC) can achieve within 1/2 bit of the 
capacity of a single-input single-output (SISO) Gaussian TWRC [2], [3], and it is asymptotically optimal 
at high signal-to-noise ratios (SNRs). In the PNC scheme, the two users transmit signals simultaneously 
to the relay. The relay recovers and forwards only compressed information of the two users, rather than 
the complete information. This is in contrast to the well-known amplify-and-forward (AF) [4]-[6] and 
decode-and-forward (DF) based schemes [7] for TWRCs. 

The existing work on PNC is limited to SISO scenarios. It is well known that multiple-input multiple-output
(MIMO) systems can provide many advantages over SISO systems in a rich-scattering environment
[8]. The challenge is to extend PNC to MIMO TWRCs. In [1] and [2], the PNC scheme requires that
the two users' signals received by the relay are aligned in the same spatial direction. This condition is
naturally guaranteed in a SISO Gaussian TWRC [1], [2]. However, in a MIMO environment, each user has
multiple eigen-modes. The directions of the eigen-modes (referred to as eigen-directions) of the two users
in the TWRC are different in general. Therefore, the main challenge is to design an efficient technique to
align the eigen-directions of the two users. This will lead to a practical PNC scheme for MIMO TWRCs.
We will show that the spectral efficiency can be up to 50% higher at practical SNR levels,
compared with the existing schemes for MIMO TWRCs that do not employ PNC.

In this paper, we propose a novel eigen-direction alignment (EDA) precoding based PNC scheme for
MIMO TWRCs. The key of the proposed EDA precoding is that it efficiently aligns the two users'
eigen-modes into the same directions. Then, we construct multiple independent PNC streams over the aligned
eigen-modes established by the EDA precoding. We refer to the proposed strategy as an EDA-PNC scheme.

We derive achievable rates of the proposed EDA-PNC scheme, based on nested lattice codes [2]. Our 
asymptotic analysis shows that the proposed EDA-PNC scheme approaches the capacity upper bound of 
a MIMO TWRC, as the numbers of user antennas increase towards infinity. For a finite number of user 
antennas, we formulate the design criterion of the optimal EDA precoder, which leads to a non-convex 
optimization problem. For a relatively small spatial dimension, we develop an exhaustive search method to
obtain the optimal EDA precoder. For a larger spatial dimension, we derive approximate solutions to the
optimization problem. Numerical results show that there is only a marginal gap between the achievable
rate of the proposed EDA-PNC scheme and the capacity upper bound of the MIMO TWRC, in the
medium-to-large SNR region. We also show that the proposed EDA-PNC scheme significantly outperforms the
existing AF- and DF-based schemes for MIMO TWRCs.

The paper is organized as follows. In Section II, we depict the model of a MIMO TWRC and a 
two-phase transmission protocol. In Section III, we derive a capacity upper bound and briefly discuss two 
existing schemes. In Section IV, we propose the EDA-PNC scheme. In Section V, we derive the achievable 
rate of the proposed scheme and present an asymptotic result. The design criterion of the optimal EDA
precoder is also given in Section V. In Section VI, we discuss sub-optimal EDA precoders. Numerical
results are shown and discussed in Section VII. Finally, we draw conclusions in Section VIII.

II. System Model 

In this section, we introduce the model of a MIMO TWRC and describe a two-phase transmission
protocol. We focus on a real-valued model in this paper. The extension of our results to a complex-valued
model is straightforward, as detailed in Appendix I.

A. Configuration of a MIMO TWRC 

A MIMO TWRC, in which user A and user B exchange information via a relay, is illustrated in Fig. 1.
Each user is equipped with $n_T$ antennas and the relay has $n_R$ antennas. All the channels in the system
are assumed to be flat-fading within the bandwidth of interest. The channel from user A (or B) to the
relay is denoted by an $n_R$-by-$n_T$ matrix $H_{A,R}$ (or $H_{B,R}$). The channel from the relay to user A (or B)
is denoted by an $n_T$-by-$n_R$ matrix $H_{R,A}$ (or $H_{R,B}$).

The users and the relay operate in half-duplex mode. There is no direct link between the two users. The
transmission protocol employs two consecutive equal-duration time-slots for each round of information
exchange between the users via the relay. Each time-slot consists of $n$ channel uses. In the first
time-slot (uplink phase), the two users transmit to the relay simultaneously and the relay remains silent. In
the second time-slot (downlink phase), the relay broadcasts to the two silent users. We assume that the
channel coefficients remain the same for each round of information exchange. We also assume that the
channel matrices are globally known by both users, as well as by the relay.

In this paper, we only consider the situation of $n_T \ge n_R$. This configuration applies to practical
scenarios such as a wireless sensor network where the physical sizes of the intermediate sensor nodes are
smaller than those of the terminal nodes.$^1$

B. Uplink Phase 

The discrete channel of the uplink phase can be written as

$$Y_R[l] = H_{A,R} X_A[l] + H_{B,R} X_B[l] + Z_R[l], \quad l = 1, \cdots, n, \tag{1}$$

where $X_m[l]$ is an $n_T$-by-1 column vector with the $i$th entry $x_{m,i}[l]$, $i = 1, \cdots, n_T$, being the coded signal
transmitted from antenna $i$ of user $m$, $m \in \{A, B\}$, at time instant $l$; $Y_R[l]$ is an $n_R$-by-1 column vector
with the $j$th entry $y_{R,j}[l]$, $j = 1, \cdots, n_R$, being the signal received at antenna $j$ of the relay; $Z_R[l]$ is an
$n_R$-by-1 additive white Gaussian noise (AWGN) vector at the relay with the $j$th entry $z_{R,j}[l] \sim \mathcal{N}(0, \sigma_R^2)$,
$j = 1, \cdots, n_R$, where $\sigma_R^2$ is the noise variance. For notational simplicity, the time index $l$ may be omitted
when this causes no ambiguity.
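The uplink model in (1) can be illustrated with a short simulation. The following is a minimal sketch, assuming Gaussian test inputs and NumPy; the function name `uplink_receive`, the dimensions and the seed are illustrative choices, not part of the scheme.

```python
import numpy as np

def uplink_receive(H_AR, H_BR, X_A, X_B, sigma_R=1.0, rng=None):
    """Return Y_R = H_AR X_A + H_BR X_B + Z_R, cf. (1).

    Columns of X_A, X_B and of the result index the n channel uses.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_R, n = H_AR.shape[0], X_A.shape[1]
    Z_R = sigma_R * rng.standard_normal((n_R, n))  # AWGN with variance sigma_R^2
    return H_AR @ X_A + H_BR @ X_B + Z_R

# Illustrative dimensions (assumed): n_T = 4 user antennas, n_R = 2 relay antennas.
rng = np.random.default_rng(0)
n_T, n_R, n = 4, 2, 8
H_AR = rng.standard_normal((n_R, n_T))
H_BR = rng.standard_normal((n_R, n_T))
X_A = rng.standard_normal((n_T, n))   # placeholder signals, not actual lattice codewords
X_B = rng.standard_normal((n_T, n))
Y_R = uplink_receive(H_AR, H_BR, X_A, X_B, rng=rng)
```

Each column of `Y_R` is one received vector $Y_R[l]$; the relay observes only the superposition of the two users' signals.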

The channel input covariances of the two users are denoted by $Q_m = \mathrm{E}(X_m X_m^T)$, $m \in \{A, B\}$, where
$\mathrm{E}(\cdot)$ stands for the expectation operation. The power constraint of the uplink phase is given by

$$\mathrm{Tr}\{Q_A + Q_B\} \le P_T \tag{2}$$

where $P_T$ is the total transmission power of the two users. The average per-user SNR of the uplink phase
is defined as

$$\mathrm{SNR} \triangleq \frac{P_T}{2\sigma_R^2}. \tag{3}$$

C. Relay's Operation 

Upon receiving $Y_R = [Y_R[1], \cdots, Y_R[n]]$, the relay generates a signal matrix $X_R = [X_R[1], \cdots, X_R[n]]$.
Here, $X_R[l]$ is an $n_R$-by-1 real vector with the $j$th entry $x_{R,j}[l]$, $j = 1, \cdots, n_R$, being the signal transmitted
from the $j$th antenna of the relay, at time instant $l$, in the downlink phase. In general, the relationship
between $X_R$ and $Y_R$ can be written as

$$X_R = f_R(Y_R) \tag{4}$$

where $f_R(\cdot)$ denotes the relay's functionality. The relay's power constraint is given by

$$\mathrm{Tr}\{Q_R\} \le P_R \tag{5}$$

where $Q_R = \mathrm{E}(X_R X_R^T)$ denotes the channel input covariance matrix of the relay in the downlink phase.

$^1$ The situation of $n_T < n_R$ will be addressed in our future work.

Remark 1: In this paper, the power constraints under consideration are given by (2) and (5). The
generalization to the case with a global sum-power constraint can be readily done by trading off the
portion of power allocated to the users and that to the relay.

D. Downlink Phase 

During the downlink phase, the signal $X_R = [X_R[1], \cdots, X_R[n]]$ serves as the channel input and is
broadcast to users A and B. The signals received by user $m$, $m \in \{A, B\}$, are given by

$$Y_m[l] = H_{R,m} X_R[l] + Z_m[l], \quad l = 1, \cdots, n, \tag{6}$$

where $Z_m[l]$ is an $n_T$-by-1 AWGN vector with the $i$th entry $z_{m,i}[l] \sim \mathcal{N}(0, \sigma_m^2)$, $i = 1, \cdots, n_T$, where
$\sigma_m^2$ is the noise variance at user $m$. Upon receiving $Y_A = [Y_A[1], \cdots, Y_A[n]]$, user A decodes user
B's message with the help of the perfect knowledge of $X_A = [X_A[1], \cdots, X_A[n]]$. Meanwhile, similar
operations are performed by user B. This finishes one round of information exchange.

For notational simplicity, we assume $\sigma_R^2 = \sigma_A^2 = \sigma_B^2 = 1$ in this paper. Then, the average per-user SNR
is $\mathrm{SNR} = P_T/2$ and the SNR of the relay is $\mathrm{SNR}_R = P_R$. The extension of our results to the case of
unequal noise power is straightforward.

III. Capacity Upper Bound and Existing Schemes for a MIMO TWRC 
A. Definitions 

The achievable rate-pair and rate-region of a MIMO TWRC are defined as follows: 
Definition 1: A rate-pair $(R_A, R_B)$ is said to be achievable if there exists a set of $2^{nR_A}$ codewords for
user A, a set of $2^{nR_B}$ codewords for user B and a relay functionality $X_R = f_R(Y_R)$, satisfying power
constraints (2) and (5), such that the decoding error probabilities approach zero at both user nodes of the
TWRC, as $n \to \infty$.

Remark 2: The rate of each user is defined as the amount of transmitted bits in each transmission
round, normalized by the duration of one phase (consisting of $n$ channel uses).

Definition 2: The achievable rate-region $\mathcal{R}$ is defined as the convex closure of all achievable rate-pairs.

B. Capacity Upper Bound of a MIMO TWRC 

We now derive a new capacity upper bound (UB) for a MIMO TWRC. We present the result in the 
following lemma, which is an extension of the cut-set bound for a SISO TWRC [2]. 

Lemma 1: For given input covariance matrices $Q_A$, $Q_B$ and $Q_R$, the achievable rate-pair of a MIMO
TWRC is upper bounded by

$$R_A \le R_A^{UB} = \frac{1}{2} \min\left[\log\det\left(I + H_{A,R} Q_A H_{A,R}^T\right), \ \log\det\left(I + H_{R,B} Q_R H_{R,B}^T\right)\right] \tag{7a}$$

$$R_B \le R_B^{UB} = \frac{1}{2} \min\left[\log\det\left(I + H_{B,R} Q_B H_{B,R}^T\right), \ \log\det\left(I + H_{R,A} Q_R H_{R,A}^T\right)\right]. \tag{7b}$$

Proof: From the cut-set bound, the achievable rate-pair of a TWRC is upper-bounded by [2]

$$R_A \le R_A^{UB} = \min\{I(X_A; Y_R | X_B), \ I(X_R; Y_B)\}, \tag{8a}$$
$$R_B \le R_B^{UB} = \min\{I(X_B; Y_R | X_A), \ I(X_R; Y_A)\}. \tag{8b}$$

Applying the capacity formula of a real-valued MIMO channel [12] for given input covariances $Q_A$, $Q_B$
and $Q_R$, we obtain (7). ■

With the result in Lemma 1, the capacity UB of a MIMO TWRC can be determined by optimizing$^2$
the covariance matrices $Q_A$, $Q_B$ and $Q_R$. This capacity UB provides an upper limit on the data rate that
any MIMO two-way relay scheme can achieve.
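For fixed covariance matrices, the bound in (7a) and (7b) is straightforward to evaluate. The sketch below assumes unit noise variance, logarithms to base 2 (rates in bits) and NumPy; the function names `half_logdet` and `rate_upper_bounds` are illustrative and do not appear in the paper. It does not perform the optimization over $Q_A$, $Q_B$ and $Q_R$.

```python
import numpy as np

def half_logdet(H, Q):
    """(1/2) log2 det(I + H Q H^T) for a real-valued MIMO link with unit noise variance."""
    M = np.eye(H.shape[0]) + H @ Q @ H.T
    _, logabsdet = np.linalg.slogdet(M)   # numerically safer than log(det(M))
    return 0.5 * logabsdet / np.log(2.0)

def rate_upper_bounds(H_AR, H_BR, H_RA, H_RB, Q_A, Q_B, Q_R):
    """Evaluate (7a) and (7b) for given input covariances (no optimization)."""
    R_A_ub = min(half_logdet(H_AR, Q_A), half_logdet(H_RB, Q_R))  # (7a)
    R_B_ub = min(half_logdet(H_BR, Q_B), half_logdet(H_RA, Q_R))  # (7b)
    return R_A_ub, R_B_ub

# Scalar sanity check: unit channel gains and power 3 on every link.
one, q = np.array([[1.0]]), np.array([[3.0]])
ra_ub, rb_ub = rate_upper_bounds(one, one, one, one, q, q, q)
```

In the scalar check every term equals $\frac{1}{2}\log_2(1+3) = 1$ bit, so both bounds evaluate to 1.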

C. Analog Network Coding for a MIMO TWRC 

Much progress has been made in developing communication strategies to approach the capacity of a
MIMO TWRC. Among those, an AF-based scheme, namely analog network coding (ANC) [4]-[6], has
attracted a great deal of attention. In ANC, the relay broadcasts an amplified version of its received signal
to the two users. The maximum achievable rate of ANC in a MIMO TWRC remains unsolved and a
sub-optimal solution is reported in [6]. In this paper, we will consider the upper bound of the achievable rate
of ANC, derived in [33], as a benchmark for comparison purposes.

$^2$ This is a convex optimization problem which can be easily solved using a standard tool, e.g., [26].

The ANC scheme has two disadvantages. First, it suffers from noise amplification, since the noise 
received by the relay is not suppressed before the signal is forwarded to the users. Second, it suffers 
from unnecessary power consumption at the relay, since the AF relay forwards a linear, rather than an 
algebraic, superposition of the signals from the two users [1]. 

D. DF with Network Coding for a MIMO TWRC 

A DF-based scheme has also been studied for a MIMO TWRC [7], [17]. In the DF-based scheme, the 
relay completely decodes both users' messages. The decoded messages of two users are re-encoded with 
a network code [1], [9], and a channel code. The resultant coded signal is broadcast to the two users in 
the downlink phase. We refer to this scheme as DF with network coding (DF-NC). 

The achievable rate of the DF-NC scheme is briefly discussed as follows. The uplink phase of DF-NC 
can be viewed as a MIMO multiple-access channel whose exact achievable rate-region is still an open 
problem, although its upper and lower bounds are studied in [20]. We will use the upper bound in [20] 
for comparison purpose. The downlink rate-region of the DF-NC scheme can be obtained by extending 
the result in [2] to a MIMO scenario. The overall achievable rate-region of the DF-NC scheme is the 
intersection of the uplink and downlink rate-regions determined above. 

The DF-NC scheme suffers from a severe multiplexing loss [3], [7], as complete decoding at the relay
is demanding and unnecessary. As a result, the achievable rate of the DF-NC scheme may be far below the
capacity of a MIMO TWRC, especially in the high SNR region [7].

IV. Eigen-Direction Alignment Based Physical-Layer Network Coding

In this section, we propose a new strategy for MIMO TWRCs. The proposed strategy consists of two 
key components: eigen-direction alignment (EDA) precoding and physical-layer network coding (PNC). 
In particular, the proposed EDA precoding algorithm efficiently aligns the eigen-directions of two users. 
Then, we carry out multi-stream PNC over the aligned eigen-modes established by the EDA precoding. 



To illustrate the proposed EDA precoding algorithm, we first describe a straightforward (naive) method 
to perform the eigen-direction alignment. 

A. A Naive Eigen-Direction Alignment Approach 

Denote by $F_A$ and $F_B$ the linear precoding matrices of user A and user B, respectively. The users'
transmitted signals can be written as

$$X_m[l] = F_m C_m[l], \quad m \in \{A, B\}, \ l = 1, \cdots, n, \tag{9}$$

where $C_m[l] = [c_{m,1}[l], \cdots, c_{m,n_R}[l]]^T$, with $\mathrm{E}(C_m C_m^T) = I$, is a length-$n_R$ column vector whose entries
denote the independently coded signals. As a straightforward approach, the precoder performs channel
inversion, i.e., $F_m$ is given by$^3$

$$F_m = H_{m,R}^T \left(H_{m,R} H_{m,R}^T\right)^{-1} \Psi_m, \quad m \in \{A, B\}, \tag{10}$$

where $H_{m,R}^T (H_{m,R} H_{m,R}^T)^{-1}$ is the Moore-Penrose pseudo-inverse of $H_{m,R}$ (for $n_T \ge n_R$) and $\Psi_m$ is an
$n_R \times n_R$ diagonal matrix which allocates power among the $n_R$ eigen-modes for user $m$. With (10), the
signal received by the relay in (1) can be written as

$$Y_R[l] = H_{A,R} F_A C_A[l] + H_{B,R} F_B C_B[l] + Z_R[l] \tag{11a}$$

$$\phantom{Y_R[l]} = \Psi_A C_A[l] + \Psi_B C_B[l] + Z_R[l]. \tag{11b}$$

Eq. (11b) represents $n_R$ parallel sub-channels, as both $\Psi_A$ and $\Psi_B$ are diagonal matrices. The above
approach is referred to as naive EDA precoding. Unfortunately, it is well known that channel inversion
in precoding suffers from a significant power loss when the channel matrix is ill-conditioned [12]. Thus,
this approach may not be an efficient method to align the eigen-directions.
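The two properties of the naive precoder discussed above (perfect alignment, but a power penalty on ill-conditioned channels) can be checked numerically. This is a minimal sketch assuming NumPy and the channel-inversion form of (10); `naive_eda_precoder` and the example matrices are illustrative choices.

```python
import numpy as np

def naive_eda_precoder(H, Psi):
    """F = H^T (H H^T)^{-1} Psi, cf. (10); H is n_R x n_T with full row rank."""
    return H.T @ np.linalg.solve(H @ H.T, Psi)

rng = np.random.default_rng(1)                 # arbitrary seed, illustration only
n_T, n_R = 3, 2
H_AR = rng.standard_normal((n_R, n_T))
Psi_A = np.diag([1.0, 0.5])                    # example per-stream amplitudes

F_A = naive_eda_precoder(H_AR, Psi_A)
effective = H_AR @ F_A                         # equals Psi_A, cf. (11b)
tx_power = np.trace(F_A @ F_A.T)               # Tr(Q_A) with Q_A = F_A F_A^T

# A nearly rank-deficient channel: the same Psi_A now costs far more power.
H_bad = np.array([[1.0, 0.0, 0.0],
                  [1.0, 1e-3, 0.0]])
F_bad = naive_eda_precoder(H_bad, Psi_A)
tx_power_bad = np.trace(F_bad @ F_bad.T)
```

The transmit power equals $\mathrm{Tr}((H_{A,R}H_{A,R}^T)^{-1}\Psi_A^2)$, which grows without bound as the rows of the channel matrix become nearly parallel.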

B. Proposed Eigen-Direction Alignment Precoding 

Now, we propose our new EDA precoding algorithm which can effectively avoid the power loss suffered
by the naive EDA precoder. Consider an invertible linear transformation of the relay's received signal as

$$\tilde{Y}_R[l] = K^{-1} Y_R[l] = K^{-1} H_{A,R} X_A[l] + K^{-1} H_{B,R} X_B[l] + K^{-1} Z_R[l], \quad l = 1, \cdots, n, \tag{12}$$

where $K$ is an $n_R$-by-$n_R$ invertible square matrix referred to as the rotation matrix. The equivalent channel
matrices are now given by

$$\tilde{H}_{m,R} = K^{-1} H_{m,R}, \quad m \in \{A, B\}. \tag{13}$$

Applying the aforementioned naive EDA precoding over the equivalent channel in (13), we obtain the
proposed new EDA precoding matrix as

$$F_m = H_{m,R}^T \left(H_{m,R} H_{m,R}^T\right)^{-1} K \Psi_m, \quad m \in \{A, B\}. \tag{14}$$

The signal received by the relay in (1) can then be written as

$$Y_R[l] = H_{A,R} F_A C_A[l] + H_{B,R} F_B C_B[l] + Z_R[l] = K\left(\Psi_A C_A[l] + \Psi_B C_B[l]\right) + Z_R[l] \tag{15}$$

where $l = 1, \cdots, n$. At the relay, after the linear transformation (12), we obtain

$$\tilde{Y}_R[l] = \Psi_A C_A[l] + \Psi_B C_B[l] + \tilde{Z}_R[l] \tag{16}$$

where $\tilde{Z}_R[l] = K^{-1} Z_R[l]$ is the equivalent noise vector. From (16), it is clear that $n_R$ aligned eigen-modes
are established. Note that we can always scale the entries of $\tilde{Y}_R$ such that the equivalent noises of all
eigen-modes have unit power. Thus, without loss of generality, we constrain the rotation matrix $K$ such that the
diagonal elements of $K^{-1}(K^{-1})^T$ are 1, i.e.,

$$\left[K^{-1}(K^{-1})^T\right]_{\mathrm{diag}} = I. \tag{17}$$

This is to ensure that the entries in the effective noise vector $\tilde{Z}_R[l]$ have unit power.

$^3$ In general, a randomly generated $H_{A,R}$ (or $H_{B,R}$) with $n_T \ge n_R$ is of full row-rank with probability 1 [12]. For simplicity of discussion,
we always assume that $H_{A,R}$ and $H_{B,R}$ are of full row-rank.

The proposed EDA precoding scheme reduces to the naive EDA scheme by letting $K = I$. By varying
the rotation matrix $K$, we can actually align the eigen-modes of the two users into any $n_R$ pre-determined
directions in the $n_R$-dimensional vector space, as illustrated in Fig. 2. An immediate question is how to
determine the optimal rotation matrix $K$. We defer the answer to this question to the next section.
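The alignment property in (16) and the normalization in (17) can be verified numerically. The sketch below assumes NumPy; `normalize_rotation` is a hypothetical helper that rescales an arbitrary invertible matrix so that (17) holds, and `eda_precoder` implements the form of (14).

```python
import numpy as np

def normalize_rotation(K0):
    """Rescale the columns of an invertible K0 so that diag(K^-1 K^-T) = 1, cf. (17)."""
    row_norms = np.linalg.norm(np.linalg.inv(K0), axis=1)
    return K0 * row_norms          # equals K0 @ diag(row_norms)

def eda_precoder(H, K, Psi):
    """F = H^T (H H^T)^{-1} K Psi, cf. (14); H must have full row rank."""
    return H.T @ np.linalg.solve(H @ H.T, K @ Psi)

rng = np.random.default_rng(2)     # arbitrary seed, illustration only
n_T, n_R = 4, 2
H_AR = rng.standard_normal((n_R, n_T))
H_BR = rng.standard_normal((n_R, n_T))
K = normalize_rotation(rng.standard_normal((n_R, n_R)))
Psi_A, Psi_B = np.diag([1.0, 0.5]), np.diag([0.8, 0.3])

K_inv = np.linalg.inv(K)
eff_A = K_inv @ H_AR @ eda_precoder(H_AR, K, Psi_A)   # expected: Psi_A
eff_B = K_inv @ H_BR @ eda_precoder(H_BR, K, Psi_B)   # expected: Psi_B
noise_power = np.diag(K_inv @ K_inv.T)                # expected: all ones
```

Both users' effective channels become the diagonal matrices $\Psi_A$ and $\Psi_B$ after the relay applies $K^{-1}$, regardless of which admissible $K$ is chosen; the choice of $K$ only affects the transmit power.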



C. The Overall Proposed EDA-PNC Scheme 

We now describe a multi-stream PNC scheme. In the uplink phase, the proposed EDA precoding (14) is
employed to establish $n_R$ aligned parallel sub-channels. The two users perform single-stream PNC for each
aligned sub-channel, and there are $n_R$ independent PNC streams in total. Similarly to the case of SISO
PNC [2], the relay recovers the $n_R$ bin-indices (as defined in [2]) instead of completely decoding both
users' individual messages. In the downlink phase, the aggregation of the $n_R$ bin-indices is re-encoded
and broadcast to the two users. Finally, each user recovers the other user's message with the help of the
perfect knowledge of its own message.

V. Achievable Rates of the Proposed EDA-PNC Scheme for MIMO TWRCs 
A. Achievable Rate-Pair 

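We now present a theorem on the achievable rate-pair of the proposed EDA-PNC scheme. Define
$[x]^+ \triangleq \max(x, 0)$.

Theorem 1: For given $K$, $\Psi_A$, $\Psi_B$ and $Q_R$, an achievable rate-pair of the proposed EDA-PNC scheme
is given by

$$R_A \le \min\left\{R_{A,UL}^{EDA}, \ R_{A,DL}^{EDA}\right\} \tag{18a}$$
$$R_B \le \min\left\{R_{B,UL}^{EDA}, \ R_{B,DL}^{EDA}\right\} \tag{18b}$$

where

$$R_{A,UL}^{EDA} = \frac{1}{2} \sum_{i=1}^{n_R} \left[\log\left(\frac{\Psi_A(i,i)^2}{\Psi_A(i,i)^2 + \Psi_B(i,i)^2} + \Psi_A(i,i)^2\right)\right]^+, \tag{19a}$$

$$R_{B,UL}^{EDA} = \frac{1}{2} \sum_{i=1}^{n_R} \left[\log\left(\frac{\Psi_B(i,i)^2}{\Psi_A(i,i)^2 + \Psi_B(i,i)^2} + \Psi_B(i,i)^2\right)\right]^+, \tag{19b}$$

$$R_{A,DL}^{EDA} = \frac{1}{2} \log\det\left(I + H_{R,B} Q_R H_{R,B}^T\right), \tag{19c}$$

$$R_{B,DL}^{EDA} = \frac{1}{2} \log\det\left(I + H_{R,A} Q_R H_{R,A}^T\right). \tag{19d}$$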

The proof of Theorem 1 is given in Appendix II. The main idea of the proof is to utilize the results
on nested lattice codes in [2].
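The uplink rate expressions (19a) and (19b) are easy to evaluate for given power matrices. The following is a minimal sketch, assuming unit noise power per aligned eigen-mode and logarithms to base 2; the function names `stream_rate` and `eda_uplink_rates` are illustrative.

```python
import math

def stream_rate(p_own, p_other):
    """[ (1/2) log2( p_own / (p_own + p_other) + p_own ) ]^+ for one aligned eigen-mode."""
    if p_own <= 0.0:
        return 0.0                      # an unused stream carries no rate
    arg = p_own / (p_own + p_other) + p_own
    return max(0.5 * math.log2(arg), 0.0)

def eda_uplink_rates(psi_A, psi_B):
    """Evaluate (19a) and (19b); psi_A, psi_B hold the diagonal entries of Psi_A, Psi_B."""
    R_A = sum(stream_rate(a * a, b * b) for a, b in zip(psi_A, psi_B))
    R_B = sum(stream_rate(b * b, a * a) for a, b in zip(psi_A, psi_B))
    return R_A, R_B

# One stream with equal powers 3.5: log-argument 0.5 + 3.5 = 4, i.e. 1 bit per user.
R_A_hi, R_B_hi = eda_uplink_rates([math.sqrt(3.5)], [math.sqrt(3.5)])
# One weak stream: the log-argument 0.5 + 0.01 is below 1, so [.]^+ clips the rate to 0.
R_A_lo, R_B_lo = eda_uplink_rates([0.1], [0.1])
```

The second case shows why the $[\cdot]^+$ operation matters at low SNR: a stream whose power is too small contributes nothing, which is consistent with the low-SNR behaviour reported in the numerical results.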



B. An Asymptotic Result on the Achievable Rate-Pair 

We next derive an asymptotic result which is based on the following observation.
Fact 1: Assume that the entries of the channel matrix $H_{m,R}$ are i.i.d. with zero mean and unit variance.
Then,

$$\frac{1}{n_T} H_{m,R} H_{m,R}^T \xrightarrow{p} I, \quad \text{as } n_T \to \infty, \tag{20}$$

where '$\xrightarrow{p}$' represents convergence in probability.

The above result follows directly from the weak law of large numbers.
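Fact 1 can be observed empirically. The sketch below assumes standard Gaussian channel entries (one admissible choice of zero-mean, unit-variance distribution) and NumPy; `gram_deviation` is an illustrative helper name.

```python
import numpy as np

def gram_deviation(n_R, n_T, rng):
    """Frobenius distance between (1/n_T) H H^T and the identity for one random H."""
    H = rng.standard_normal((n_R, n_T))      # i.i.d. entries, zero mean, unit variance
    return np.linalg.norm(H @ H.T / n_T - np.eye(n_R))

rng = np.random.default_rng(0)               # arbitrary seed, illustration only
dev_small = gram_deviation(2, 4, rng)        # few user antennas
dev_large = gram_deviation(2, 40000, rng)    # many user antennas
```

As $n_T$ grows with $n_R$ fixed, the deviation shrinks on the order of $1/\sqrt{n_T}$, so the channel Gram matrix becomes nearly a scaled identity and channel inversion no longer incurs a significant power loss.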

Theorem 2: Assume that the channel coefficients in $H_{A,R}$ and $H_{B,R}$ are i.i.d. with zero mean and unit
variance. As $n_T$ tends to infinity (while $n_R$ remains finite), the proposed EDA-PNC scheme asymptotically
achieves the capacity of a MIMO TWRC in probability. □

The proof of Theorem 2 is given in Appendix III. Theorem 2 states that the proposed EDA-PNC scheme
achieves the capacity upper bound of a MIMO TWRC in probability as $n_T \to \infty$. This asymptotic
result will be verified by the numerical results presented later.

C. Determining the Achievable Rate-Region 

Here, we consider the achievable rate-region of the proposed EDA-PNC scheme, based on the results
of Theorem 1. Define the following rate-regions

$$\mathcal{R}_{UL}^{EDA} \triangleq \left\{(R_A, R_B) : R_A \le R_{A,UL}^{EDA}, \ R_B \le R_{B,UL}^{EDA}\right\}, \tag{21a}$$
$$\mathcal{R}_{DL}^{EDA} \triangleq \left\{(R_A, R_B) : R_A \le R_{A,DL}^{EDA}, \ R_B \le R_{B,DL}^{EDA}\right\}. \tag{21b}$$

The above two rate-regions will be respectively determined in the following.

1) Uplink Rate-Region: The boundary of the uplink rate-region $\mathcal{R}_{UL}^{EDA}$ can be determined by solving
the following weighted sum-rate (WSR) problem

$$\max_{K, \Psi_A, \Psi_B} \left\{\alpha R_{A,UL}^{EDA} + (1 - \alpha) R_{B,UL}^{EDA}\right\} \tag{22a}$$

subject to

$$\mathrm{Tr}\left(\left(H_{A,R} H_{A,R}^T\right)^{-1} K \Psi_A^2 K^T + \left(H_{B,R} H_{B,R}^T\right)^{-1} K \Psi_B^2 K^T\right) \le P_T \tag{22b}$$

and (17), for $0 \le \alpha \le 1$. Note that the power constraint in (22b) is obtained by substituting (14) and
$Q_m = F_m F_m^T$, $m \in \{A, B\}$, into (2).

The problem in (22) is non-convex and hence is difficult to solve. For a small $n_R$, e.g., $n_R = 2$,
the optimal parameters $(K, \Psi_A, \Psi_B)$ can be found by an exhaustive search. Unfortunately, this method
quickly becomes prohibitively complex as $n_R$ increases. We will provide approximate solutions to this
problem in Section VI.
2) Downlink Rate-Region: The boundary of the downlink rate-region $\mathcal{R}_{DL}^{EDA}$ can be determined by
solving

$$\max_{Q_R: \ \mathrm{Tr}(Q_R) \le P_R} \left\{\alpha R_{A,DL}^{EDA} + (1 - \alpha) R_{B,DL}^{EDA}\right\} \tag{23}$$

for $0 \le \alpha \le 1$. Note that $\log\det(\cdot)$ is concave and thus the objective function in (23) is concave in $Q_R$.
In addition, $\mathrm{Tr}(Q_R) \le P_R$ is a linear constraint. Therefore, (23) can be solved using convex optimization.

3) Overall Rate-Region: The overall achievable rate-region of the proposed EDA-PNC scheme is the
intersection of $\mathcal{R}_{UL}^{EDA}$ and $\mathcal{R}_{DL}^{EDA}$.

The major difficulty in determining the above achievable rate-region of the proposed EDA-PNC scheme
is to solve the WSR problem in (22). In the next section, we will provide two suboptimal solutions to
this problem.

VI. Approximate Solutions to the Optimal EDA Precoder 

A. Approximate Solution I 

To simplify the problem in (22), we introduce two extra constraints on the proposed EDA precoder: 1)
The rotation matrix $K$ is unitary, i.e.,

$$K K^T = I, \tag{24}$$

and 2) The power matrices satisfy

$$\Psi_A = \Sigma, \quad \Psi_B = \gamma \Sigma, \tag{25}$$

where $\gamma$ is a positive scalar and $\Sigma$ is a diagonal matrix with non-negative diagonal elements. Although
these extra constraints may lead to a certain performance loss, a closed-form solution then exists, which
yields crucial insights into the design of the EDA precoder. Later, we will consider the relaxation of these
two constraints to obtain a better approximate solution.

1) Optimal Unitary Rotation Matrix $K$ (for $\Psi_B = \gamma \Psi_A$): Now we derive the most power-efficient
unitary rotation matrix $K$ for given $\Psi_A = \Sigma$ and $\Psi_B = \gamma \Sigma$. The problem is formulated as

$$K_{opt}^{(\Sigma, \gamma)} = \arg \min_{K: \ K K^T = I} \mathrm{Tr}\left(F_A F_A^T + F_B F_B^T\right). \tag{26}$$

Let the singular value decomposition (SVD) of the channel matrix $H_{m,R}$ be

$$H_{m,R} = U_m \Sigma_m V_m^T, \quad m \in \{A, B\}, \tag{27}$$

where $U_m$ and $V_m$ are unitary matrices and $\Sigma_m$ is an $n_R$-by-$n_T$ diagonal matrix with positive diagonal
elements. Denote by $\Sigma_m^{-1}$ the pseudo-inverse of $\Sigma_m$, i.e.,

$$\Sigma_m^{-1} = \begin{bmatrix} \bar{\Sigma}_m^{-1} \\ 0_{(n_T - n_R) \times n_R} \end{bmatrix},$$

where $\bar{\Sigma}_m$ is an $n_R$-by-$n_R$ matrix formed by the first $n_R$ columns of $\Sigma_m$ and $0_{(n_T - n_R) \times n_R}$ denotes an
$(n_T - n_R)$-by-$n_R$ matrix with all-zero entries. For notational simplicity, we denote $(\Sigma_m \Sigma_m^T)^{-1}$ by $\Sigma_m^{-2}$.
Then, using (14) and (27), the problem (26) becomes

$$K_{opt}^{(\Sigma, \gamma)} = \arg \min_{K: \ K K^T = I} \mathrm{Tr}\left(\left(U_A \Sigma_A^{-2} U_A^T + \gamma^2 U_B \Sigma_B^{-2} U_B^T\right) K \Sigma^2 K^T\right). \tag{28}$$

K: KK =1 

Define

$$G(\gamma) \triangleq U_A \Sigma_A^{-2} U_A^T + \gamma^2 U_B \Sigma_B^{-2} U_B^T. \tag{29}$$

The eigen-decomposition of $G(\gamma)$ yields $G(\gamma) = U_{G(\gamma)} \Lambda_{G(\gamma)} U_{G(\gamma)}^T$, where $\Lambda_{G(\gamma)}$ is a diagonal matrix
with the diagonal entries arranged in ascending order, and $U_{G(\gamma)}$ is a unitary matrix. Without loss of
generality, we always assume that the diagonal entries of $\Sigma$ are arranged in descending order. Now,
we present the optimal unitary rotation matrix $K$ in the following theorem.

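Theorem 3: For any given $\Psi_A = \Sigma$ and $\Psi_B = \gamma \Sigma$, the solution to the problem in (28) is

$$K_{opt}^{(\gamma)} = U_{G(\gamma)}. \tag{30}$$

Proof: With (29), the objective function in (28) is written as

$$\mathrm{Tr}\left(G(\gamma) K \Sigma^2 K^T\right) = \mathrm{Tr}\left(U_{G(\gamma)} \Lambda_{G(\gamma)} U_{G(\gamma)}^T K \Sigma^2 K^T\right) \ge \mathrm{Tr}\left(\Lambda_{G(\gamma)} \Sigma^2\right), \tag{31}$$

where the equality in the last step holds when $K = U_{G(\gamma)}$. The inequality in (31) follows from the fact [21], [32]:
for any two Hermitian matrices $M$ and $N$ with eigen-decompositions $M = U_M \Lambda_M U_M^T$ and $N = U_N \Lambda_N U_N^T$,

$$\mathrm{Tr}(M N) \ge \mathrm{Tr}(\Lambda_M \Lambda_N) \tag{32}$$

where the diagonal elements of $\Lambda_M$ and those of $\Lambda_N$ are reversely ordered. Since (28) is a minimization and the
lower bound in (31) is attained at $K = U_{G(\gamma)}$, this finishes the proof. ■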
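Theorem 3 can be sanity-checked numerically by comparing the transmit power obtained with $K = U_{G(\gamma)}$ against randomly drawn orthogonal matrices. This is a minimal sketch assuming NumPy; the dimensions, the value of $\gamma$, the entries of $\Sigma^2$ and the seed are arbitrary illustrative choices. It uses the identity $(H_{m,R}H_{m,R}^T)^{-1} = U_m \Sigma_m^{-2} U_m^T$ to form $G(\gamma)$ without computing SVDs.

```python
import numpy as np

rng = np.random.default_rng(3)          # arbitrary seed, illustration only
n_T, n_R, gamma = 4, 2, 0.7
H_AR = rng.standard_normal((n_R, n_T))
H_BR = rng.standard_normal((n_R, n_T))

# G(gamma) in (29), written via (H H^T)^{-1} = U Sigma^{-2} U^T.
G = np.linalg.inv(H_AR @ H_AR.T) + gamma**2 * np.linalg.inv(H_BR @ H_BR.T)
G = 0.5 * (G + G.T)                     # remove round-off asymmetry
lam, K_opt = np.linalg.eigh(G)          # eigenvalues returned in ascending order
sigma2 = np.array([2.0, 0.5])           # Sigma(i,i)^2 in descending order

def tx_power(K):
    """Objective of (28): Tr(G K Sigma^2 K^T)."""
    return np.trace(G @ K @ np.diag(sigma2) @ K.T)

p_opt = tx_power(K_opt)                 # equals sum(lam * sigma2), cf. (31)
# Random orthogonal matrices from QR factorizations of Gaussian matrices.
p_rand = [tx_power(np.linalg.qr(rng.standard_normal((n_R, n_R)))[0])
          for _ in range(200)]
```

None of the random orthogonal matrices should require less power than `K_opt`: pairing the largest stream power with the smallest eigenvalue of $G(\gamma)$ is what minimizes the trace.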
We have the following comments on Theorem 3.

Remark 3: The optimal unitary rotation matrix depends on $\gamma$, but not on $\Sigma$. Thus, we write
$K_{opt}^{(\gamma)}$ instead of $K_{opt}^{(\Sigma, \gamma)}$.

Remark 4: With $K_{opt}^{(\gamma)} = U_{G(\gamma)}$, the power constraint in (2) can be expressed as

$$\mathrm{Tr}(Q_A + Q_B) = \mathrm{Tr}\left(\Lambda_{G(\gamma)} \Sigma^2\right) \le P_T. \tag{33}$$

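2) Uplink Rate-Region Revisited: Here, we present an approximate solution to the uplink achievable
rate-region $\mathcal{R}_{UL}^{EDA}$ of the proposed EDA-PNC scheme. With $\Psi_A = \Sigma$ and $\Psi_B = \gamma \Sigma$, (19a) and (19b)
become

$$R_{A,UL}^{EDA} = \frac{1}{2} \sum_{i=1}^{n_R} \left[\log\left(\frac{1}{1 + \gamma^2} + \Sigma(i,i)^2\right)\right]^+, \tag{34a}$$

$$R_{B,UL}^{EDA} = \frac{1}{2} \sum_{i=1}^{n_R} \left[\log\left(\frac{\gamma^2}{1 + \gamma^2} + \gamma^2 \Sigma(i,i)^2\right)\right]^+, \tag{34b}$$

where $\Sigma(i,i)$ denotes the $i$th diagonal entry of $\Sigma$.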
Correspondingly, the WSR problem in (22) becomes

$$\max_{K, \Sigma, \gamma} \left\{\alpha R_{A,UL}^{EDA} + (1 - \alpha) R_{B,UL}^{EDA}\right\} \tag{35a}$$

subject to

$$\mathrm{Tr}\left(G(\gamma) K \Sigma^2 K^T\right) \le P_T \quad \text{and} \quad K K^T = I. \tag{35b}$$

For the above problem, if the optimal $(K, \Sigma)$ couple for any given $\gamma$ can be found, the optimal solution
to (35) can be easily determined by a one-dimensional full search over $\gamma$.

We next determine the optimal $(K, \Sigma)$ couple for an arbitrarily given $\gamma$. The optimal unitary rotation
matrix $K$ to the problem in (35), for a given $\gamma$, is presented in the following lemma (which is a direct
result of Theorem 3).

Lemma 2: Given $\gamma$, the optimal $K$ to the maximum WSR problem in (35) is $K_{opt}^{(\gamma)} = U_{G(\gamma)}$ given in
(30).

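The remaining task is to find the optimal diagonal matrix $\Sigma$. The optimization problem in (35a) can
be equivalently written as

$$\max_{\Sigma} \ \frac{\alpha}{2} \sum_{i=1}^{n_R} \left[\log\left(\frac{1}{1 + \gamma^2} + \Sigma(i,i)^2\right)\right]^+ + \frac{1 - \alpha}{2} \sum_{i=1}^{n_R} \left[\log\left(\frac{\gamma^2}{1 + \gamma^2} + \gamma^2 \Sigma(i,i)^2\right)\right]^+ \tag{36}$$

subject to $\mathrm{Tr}(\Lambda_{G(\gamma)} \Sigma^2) \le P_T$ (cf. (33)).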

The objective function (36) involves $[\cdot]^+$ operations, and thus is not concave. However, if we know
in advance which $[\cdot]^+$ operations should be activated, (36) can be converted into a convex optimization
problem. Consider any two index subsets $\mathcal{S}_m \subseteq \{1, \ldots, n_R\}$, $m \in \{A, B\}$. We formulate the following
problem

$$\max_{\Sigma} \ \frac{\alpha}{2} \sum_{i \in \mathcal{S}_A} \log\left(\frac{1}{1 + \gamma^2} + \Sigma(i,i)^2\right) + \frac{1 - \alpha}{2} \sum_{i \in \mathcal{S}_B} \log\left(\frac{\gamma^2}{1 + \gamma^2} + \gamma^2 \Sigma(i,i)^2\right) \tag{37}$$

subject to $\mathrm{Tr}(\Lambda_{G(\gamma)} \Sigma^2) \le P_T$. The solution to the above problem is given in the following lemma, with
the proof given in Appendix IV.

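Lemma 3: For given $\mathcal{S}_A$, $\mathcal{S}_B$ and $\gamma$, the solution to the problem in (37) is given by

$$\left[\Sigma_{opt}^{(\mathcal{S}_A, \mathcal{S}_B, \gamma)}(i,i)\right]^2 =
\begin{cases}
\left[\dfrac{1}{2 \lambda \Lambda_{G(\gamma)}(i,i)} - \dfrac{1}{1 + \gamma^2}\right]^+ & \text{if } i \in \mathcal{S}_A \text{ and } i \in \mathcal{S}_B \\[2ex]
\left[\dfrac{\alpha}{2 \lambda \Lambda_{G(\gamma)}(i,i)} - \dfrac{1}{1 + \gamma^2}\right]^+ & \text{if } i \in \mathcal{S}_A \text{ and } i \notin \mathcal{S}_B \\[2ex]
\left[\dfrac{1 - \alpha}{2 \lambda \Lambda_{G(\gamma)}(i,i)} - \dfrac{1}{1 + \gamma^2}\right]^+ & \text{if } i \notin \mathcal{S}_A \text{ and } i \in \mathcal{S}_B \\[2ex]
0 & \text{if } i \notin \mathcal{S}_A \text{ and } i \notin \mathcal{S}_B
\end{cases} \tag{38}$$

where $\lambda$ is a real scalar satisfying

$$\sum_{i=1}^{n_R} \Lambda_{G(\gamma)}(i,i) \left[\Sigma_{opt}^{(\mathcal{S}_A, \mathcal{S}_B, \gamma)}(i,i)\right]^2 = P_T. \tag{39}$$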



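Lemma 3 yields the optimal power matrix $\Sigma_{opt}^{(\mathcal{S}_A, \mathcal{S}_B, \gamma)}$ for given $\mathcal{S}_A$ and $\mathcal{S}_B$. The optimal power matrix
$\Sigma_{opt}^{(\gamma)}$ can be found by evaluating $\Sigma_{opt}^{(\mathcal{S}_A, \mathcal{S}_B, \gamma)}$ for all possible $\{\mathcal{S}_A, \mathcal{S}_B\}$.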
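The water-filling structure of the solution to (37) can be motivated by a short Lagrangian argument. The following is only a sketch, not the proof of Appendix IV; it assumes natural logarithms (a different base only rescales the multiplier $\lambda$) and uses the shorthand $s_i = \Sigma(i,i)^2$ and $w_i$, which are introduced here for illustration.

```latex
% Shorthand (illustrative): s_i = \Sigma(i,i)^2, and
% w_i = \alpha \, \mathbb{1}[i \in \mathcal{S}_A] + (1-\alpha) \, \mathbb{1}[i \in \mathcal{S}_B].
% Since \log\!\big(\tfrac{\gamma^2}{1+\gamma^2} + \gamma^2 s_i\big)
%     = \log \gamma^2 + \log\!\big(\tfrac{1}{1+\gamma^2} + s_i\big),
% the objective of (37) equals, up to a constant,
\begin{align*}
  \sum_{i=1}^{n_R} \frac{w_i}{2} \log\!\Big(\frac{1}{1+\gamma^2} + s_i\Big),
  \qquad \text{subject to } \sum_{i=1}^{n_R} \Lambda_{G(\gamma)}(i,i)\, s_i \le P_T,\ s_i \ge 0 .
\end{align*}
% Stationarity of the Lagrangian with multiplier \lambda > 0 gives
\begin{align*}
  \frac{w_i}{2\big(\tfrac{1}{1+\gamma^2} + s_i\big)} = \lambda\, \Lambda_{G(\gamma)}(i,i)
  \quad\Longrightarrow\quad
  s_i = \Big[\frac{w_i}{2\lambda\, \Lambda_{G(\gamma)}(i,i)} - \frac{1}{1+\gamma^2}\Big]^+ ,
\end{align*}
% where the [.]^+ enforces s_i >= 0 and \lambda is chosen so that the power constraint is tight.
```

Setting $w_i = 1$, $\alpha$, $1 - \alpha$ or $0$ gives the four cases of an index belonging to both subsets, only $\mathcal{S}_A$, only $\mathcal{S}_B$, or neither.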
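We now conclude the solution to (35) in the following theorem.
Theorem 4: For any given $\gamma$, the optimal $(K, \Sigma)$ to the problem in (35) is given by

$$K = K_{opt}^{(\gamma)} \quad \text{and} \quad \Sigma = \Sigma_{opt}^{(\gamma)},$$

where $\Sigma_{opt}^{(\gamma)}$ is the optimal $\Sigma_{opt}^{(\mathcal{S}_A, \mathcal{S}_B, \gamma)}$ over all possible $\{\mathcal{S}_A, \mathcal{S}_B\}$, and $K_{opt}^{(\gamma)}$ is given in (30).

Proof: This follows directly from Lemma 2 and Lemma 3. ■

To solve (36) more efficiently, we may confine $\mathcal{S}_A = \mathcal{S}_B$. Then, the solution is given by

$$\left[\Sigma_{opt}^{(\gamma)}(i,i)\right]^2 = \left[\frac{1}{2 \lambda \Lambda_{G(\gamma)}(i,i)} - \frac{1}{1 + \gamma^2}\right]^+. \tag{40}$$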



It is observed from numerical results that the extra constraint of $\mathcal{S}_A = \mathcal{S}_B$ incurs a negligible performance
loss. Finally, we perform a one-dimensional full search over $\gamma$ which yields the approximate solution. This
algorithm is summarized as the approximate solution I below.
Approximate Solution I

for $\gamma$ = 0 to 1 and $1/\gamma$ = 1 to 0, with a step $\delta$
    compute $\Lambda_{G(\gamma)}$ using (29)
    compute $\Sigma_{opt}^{(\gamma)}$ using (40)
    compute $R_{A,UL}^{EDA}$ and $R_{B,UL}^{EDA}$ in (34a) and (34b)
    back up the corresponding WSR
end
find the highest WSR in the backup
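The steps of Approximate Solution I can be sketched in code as follows. This is a minimal illustration assuming NumPy, base-2 logarithms, the squared water-filling form of (40), and a bisection on the water level $\mu = 1/(2\lambda)$ to meet the power constraint with equality; the function names, the grid construction (which excludes the endpoints $\gamma = 0$ and $1/\gamma = 0$) and the example parameters are illustrative choices rather than the authors' implementation.

```python
import numpy as np

def waterfill(lam, a, P_T, iters=100):
    """Sigma(i,i)^2 = [mu/lam_i - a]^+ with mu = 1/(2*lambda) set so that sum(lam*sigma2) = P_T."""
    lo, hi = 0.0, P_T + a * lam.max()
    for _ in range(iters):                      # bisection on the water level mu
        mu = 0.5 * (lo + hi)
        used = np.sum(np.maximum(mu - a * lam, 0.0))
        lo, hi = (mu, hi) if used < P_T else (lo, mu)
    return np.maximum(0.5 * (lo + hi) / lam - a, 0.0)

def uplink_wsr(sigma2, gamma, alpha):
    """Weighted sum of (34a) and (34b), in bits."""
    a = 1.0 / (1.0 + gamma**2)
    R_A = 0.5 * np.sum(np.maximum(np.log2(a + sigma2), 0.0))
    R_B = 0.5 * np.sum(np.maximum(np.log2(gamma**2 * (a + sigma2)), 0.0))
    return alpha * R_A + (1.0 - alpha) * R_B

def approximate_solution_1(H_AR, H_BR, P_T, alpha=0.5, delta=0.05):
    """Grid search over gamma; returns (best WSR, gamma, K, Sigma^2 diagonal)."""
    G_A = np.linalg.inv(H_AR @ H_AR.T)
    G_B = np.linalg.inv(H_BR @ H_BR.T)
    t = np.arange(delta, 1.0 + 1e-12, delta)
    best = None
    for gamma in np.concatenate([t, 1.0 / t[:-1]]):      # gamma in (0,1], then 1/gamma in (0,1)
        lam, U = np.linalg.eigh(G_A + gamma**2 * G_B)    # (29) and K = U_G, cf. (30)
        sigma2 = waterfill(lam, 1.0 / (1.0 + gamma**2), P_T)   # (40)
        wsr = uplink_wsr(sigma2, gamma, alpha)
        if best is None or wsr > best[0]:
            best = (wsr, gamma, U, sigma2)
    return best

rng = np.random.default_rng(4)                   # arbitrary seed, illustration only
H_A, H_B = rng.standard_normal((2, 3)), rng.standard_normal((2, 3))
wsr, gamma_best, K_best, sigma2_best = approximate_solution_1(H_A, H_B, P_T=20.0)
```

Because `eigh` returns eigenvalues in ascending order, the resulting stream powers are automatically in descending order, matching the ordering convention assumed for $\Sigma$.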



B. Approximate Solution II 

The approximate solution I (AS-I) relies on two constraints: 1) $K K^T = I$ and 2) $\Psi_A = \Sigma$, $\Psi_B = \gamma \Sigma$.
We next relax these constraints to improve the AS-I.

We start with the first constraint. Recall the optimization problem in (28). We may ask what is the
optimal rotation matrix $K$ while relaxing the unitary matrix constraint. This problem is formulated as$^4$

$$\min_{K} \ \mathrm{Tr}\left(G(\gamma) K \Sigma^2 K^T\right) \tag{41a}$$

$$\text{s.t.} \quad \left[K^{-1}(K^{-1})^T\right]_{\mathrm{diag}} = I. \tag{41b}$$

We present a solution to this problem, with the proof given in Appendix V.

Lemma 4: For $\Psi_A = \Sigma$ and $\Psi_B = \gamma \Sigma$ with $\Sigma$ given by (40), the optimal rotation matrix $K$ for the
problem in (41a) and (41b) is given by $K_{opt}^{(\gamma)}$ in (30).

The above lemma means that, given the power matrices $\Psi_A$ and $\Psi_B$ obtained from AS-I, it is impossible
to find a more power-efficient $K$ than the unitary one given by (30).

We next relax the second constraint. With $K$ given by (30), we optimize the power matrices $\Psi_A$ and
$\Psi_B$ without confining to $\Psi_B = \gamma \Psi_A$. The corresponding WSR problem is written as

$$\max_{\Psi_A, \Psi_B} \left\{\alpha R_{A,UL}^{EDA} + (1 - \alpha) R_{B,UL}^{EDA}\right\} \tag{42a}$$

subject to

$$\mathrm{Tr}\left(U_A \Sigma_A^{-2} U_A^T K_{opt}^{(\gamma)} \Psi_A^2 \left(K_{opt}^{(\gamma)}\right)^T + U_B \Sigma_B^{-2} U_B^T K_{opt}^{(\gamma)} \Psi_B^2 \left(K_{opt}^{(\gamma)}\right)^T\right) \le P_T. \tag{42b}$$

The objective function in (42a) is concave in $\Psi_A$ and $\Psi_B$ if we set the term $\Psi_A^2 (\Psi_A^2 + \Psi_B^2)^{-1}$ of $R_{A,UL}^{EDA}$ in (19a),
and $\Psi_B^2 (\Psi_A^2 + \Psi_B^2)^{-1}$ of $R_{B,UL}^{EDA}$ in (19b), to be pre-determined constant matrices $\Theta$ and $I - \Theta$, respectively. The
solution can then be found by recursively solving (42a) with $\Theta$ fixed (which is a convex optimization
problem), and then updating $\Theta$ using the new solution of $\Psi_A$ and $\Psi_B$. The details are tedious and thus
omitted here.

$^4$ The constraint (41b) implies that, for any given $\Sigma$, the achievable rate pair of the EDA-PNC scheme is fixed.

We summarize approximate solution II as follows:

Approximate Solution II

Given the $\gamma$ and $K_{opt}^{(\gamma)}$ from Approximate Solution I
    solve the problem in (42a) and (42b)



VII. Numerical Results 

In this section, we provide numerical results to evaluate the performance of the proposed EDA-PNC
scheme for MIMO TWRCs. In the simulations, we always assume that the relay SNR and the average per-user
SNR are identical, i.e., $SNR_R = SNR$. The results presented below are obtained by averaging over 1,000
channel realizations.

A. Achievable Sum-Rates of MIMO TWRCs with $n_T \ge n_R = 2$

Here, we present the numerical results for real-valued MIMO TWRCs with $n_T = n_R = 2$. The
coefficients in the channel matrices are independently drawn from $\mathcal{N}(0, 1)$. The optimal rotation matrix
K and the optimal power matrices $\Psi_A$ and $\Psi_B$ are found by exhaustive search.
The achievable sum-rate of the proposed EDA-PNC scheme is plotted in Fig. 3. The sum-capacity UB
of the MIMO TWRC, the achievable sum-rate UBs of the ANC and DF-NC schemes, as well as the
achievable sum-rate of the naive EDA-PNC scheme (with K = I), are also included for comparison. In
the high SNR region, we observe that the gap between the achievable sum-rate of the proposed EDA-PNC
scheme and the sum-capacity UB of the MIMO TWRC is very small, e.g., less than 0.3 bit/s/Hz in
spectral efficiency, or less than 0.4 dB in power efficiency, at an SNR greater than 15 dB. We also see that
the proposed EDA-PNC scheme significantly outperforms the ANC, DF-NC and naive EDA-PNC schemes.
Specifically, the ANC scheme suffers from a significant power loss of about 3-4 dB compared with the
proposed EDA-PNC scheme. The DF-NC scheme suffers from a severe multiplexing loss, as the slope
of its performance curve is nearly half that of the other schemes. In the low SNR region, we
observe that the DF-NC scheme almost achieves the sum-capacity UB, and that the proposed EDA-PNC
scheme is inferior to the DF-NC scheme. This is due to the inherent disadvantage of nested lattice codes
in the low SNR region [2].

Next, we show the numerical results of the proposed EDA-PNC scheme for MIMO TWRCs with $n_R = 2$
and $n_T = 2, 3, 4$. In Fig. 4, two performance curves of the proposed EDA-PNC scheme are illustrated.
One is based on the exhaustive search method, and the other is based on the approximate solution II
(AS-II) method developed in Section VI. The sum-capacity UBs of the MIMO TWRCs and the performance
curves of the DF-NC scheme are also plotted. In the medium-to-high SNR region, the gap between the
proposed EDA-PNC scheme and the sum-capacity UB of the MIMO TWRC diminishes as $n_T$ increases.
This agrees well with the asymptotic optimality of EDA-PNC, as stated in Theorem 2. We also see that
there is only a tiny gap between the optimal EDA-PNC curve (obtained from the exhaustive search) and the
one based on AS-II. This implies that the proposed AS-II algorithm is nearly optimal for $n_R = 2$.

B. Achievable Rates of MIMO TWRCs with $n_T \ge n_R = 4$

Now, we consider complex-valued MIMO TWRCs with $n_T \ge n_R = 4$. The channel coefficients are
now independently drawn from $\mathcal{CN}(0, 1)$. In this case, the complexity of exhaustive search in finding
the optimal EDA precoder is prohibitively high. Thus, we confine our results to the approximate solutions
developed in Section VI.
1) Achievable Sum-Rates: In Fig. 5, we plot the achievable sum-rate of the proposed EDA-PNC scheme
with $n_T = n_R = 4$. This figure also includes the performance curves of the other schemes considered in
Fig. 3. The only difference is that the AS-II algorithm is used in plotting the performance curve of the
proposed EDA-PNC scheme. Comparing Fig. 5 with Fig. 3, we see that the relative performance trends of
these schemes are quite similar, except that the gap between the proposed EDA-PNC and the capacity UB
is slightly larger (about 1.4 dB in power efficiency in the high SNR region) in Fig. 5. We conjecture that
this performance degradation is mainly due to the sub-optimality of AS-II. We will explore possible
improvements to AS-II in our future work.

In Fig. 6, we further study the impact of $n_T$ on the achievable sum-rate in the case of $n_R = 4$. Similar
to Fig. 4, we see that the proposed EDA-PNC scheme asymptotically approaches the capacity UB as $n_T$
increases. It is also worth mentioning that, for $n_T = 8$ and $n_R = 4$, the proposed EDA-PNC scheme can
increase the spectral efficiency by more than 50% relative to the DF-NC scheme, at a practical SNR level
(e.g., SNR = 15 dB). In addition, we compare the performance of the AS-I and AS-II algorithms in Fig. 6. We
see that AS-II always slightly outperforms AS-I. For this reason, we only include the performance curves
of AS-II in the other figures presented in this paper.

2) Achievable Rate-Regions: We next show the achievable rate-region of the proposed EDA-PNC
scheme (based on AS-II). The results for the case of $n_T = n_R = 4$ are shown in Fig. 7 at SNR = 0,
10, 15, 25 dB. We also include the rate-regions of the capacity UB, the DF-NC scheme and the naive
EDA-PNC scheme. Clearly, the proposed scheme achieves a significantly larger rate-region relative to
the DF-NC scheme and the naive EDA-PNC scheme, at a medium-to-high SNR. For an SNR of 15 dB,
the proposed EDA-PNC scheme outperforms the DF-NC scheme, whereas the naive EDA-PNC scheme
is worse than the DF-NC scheme, for the entire rate-region. Compared to the naive EDA precoding, the
performance gain achieved by the proposed EDA precoding is significant. For low SNRs, e.g., SNR =
0 dB, the achievable rate-region of DF-NC is very close to the capacity outer bound of the MIMO TWRC
and is better than that of the EDA-PNC scheme. This is in agreement with the sum-rate results presented above.

Finally, in Fig. 8, we plot the achievable rate-region of the proposed EDA-PNC scheme with $n_T = 8$
and $n_R = 4$. Compared with Fig. 7, we observe that the gap between the achievable rate-region of the
proposed EDA-PNC scheme and the capacity outer bound of the MIMO TWRC becomes smaller for the
entire SNR range. This agrees well with Theorem 2.

In summary, the results shown in Figs. 3-8 clearly demonstrate the benefits of the proposed EDA-PNC
scheme for MIMO TWRCs.

VIII. Conclusions 

In this paper, we proposed an EDA-PNC scheme to approach the capacity of a MIMO TWRC. The
proposed EDA precoder efficiently creates $n_R$ aligned parallel channels for the two users, which provides
a platform to perform multi-stream PNC. In such a manner, the benefits of PNC can now be exploited
in a MIMO two-way relay system. We derived an achievable rate of the proposed EDA-PNC scheme
and showed that, as $n_T / n_R$ increases (towards infinity), the proposed EDA-PNC scheme approaches the
capacity upper bound of a MIMO TWRC. For a finite $n_T$, numerical results demonstrated that there is
only a marginal gap between the achievable rate of the proposed scheme and the capacity upper bound,
and the proposed scheme clearly outperforms the existing benchmark schemes. It is worth mentioning
that the discussion in this paper is limited to the situation of $n_T \ge n_R$. The extension of this work to
the case of $n_T < n_R$ requires a dimension reduction method and is of interest for future work.

Appendix I Treatment for a Complex-Valued Model

The results of this paper, derived based on a real-valued system model, can be readily extended to the
case of a complex-valued model. The key observation is that every complex-valued system model can be
equivalently expressed in a real-valued form.

For example, suppose that the uplink channel model is complex-valued. It can be equivalently
expressed in a real-valued form as

$$\begin{bmatrix} \Re(Y_R) \\ \Im(Y_R) \end{bmatrix} = \sum_{m \in \{A,B\}} \begin{bmatrix} \Re(H_{m,R}) & -\Im(H_{m,R}) \\ \Im(H_{m,R}) & \Re(H_{m,R}) \end{bmatrix} \begin{bmatrix} \Re(X_m) \\ \Im(X_m) \end{bmatrix} + \begin{bmatrix} \Re(Z_R) \\ \Im(Z_R) \end{bmatrix} \tag{43}$$

where $\Re(\cdot)$ and $\Im(\cdot)$ denote the real part and imaginary part of a complex-valued matrix (or a vector),
respectively.

It is noteworthy that the above relationship also applies to the downlink channel model. In this way,
the results obtained for the real-valued system are directly applicable to a complex-valued system.
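The real-valued stacking used in Appendix I can be checked numerically. The following is a minimal sketch for a single user and a single channel use, using only the Python standard library; the helper names (`real_equivalent`, `stack_vector`, `matvec`, `check_equivalence`) are hypothetical and introduced here only for illustration.

```python
import random

def real_equivalent(H):
    """Stack a complex matrix H (list of rows) into [[Re H, -Im H], [Im H, Re H]]."""
    top = [[h.real for h in row] + [-h.imag for h in row] for row in H]
    bottom = [[h.imag for h in row] + [h.real for h in row] for row in H]
    return top + bottom

def stack_vector(x):
    """Stack a complex vector x into [Re x; Im x]."""
    return [v.real for v in x] + [v.imag for v in x]

def matvec(M, x):
    """Plain matrix-vector product for a matrix given as a list of rows."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def check_equivalence(n_r=2, n_t=3, seed=0):
    """Largest deviation between the stacked complex product and the real-valued product."""
    rng = random.Random(seed)
    H = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_t)]
         for _ in range(n_r)]
    x = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_t)]
    y_complex = stack_vector(matvec(H, x))                # [Re(Hx); Im(Hx)]
    y_real = matvec(real_equivalent(H), stack_vector(x))  # real-valued model
    return max(abs(a - b) for a, b in zip(y_complex, y_real))
```

Calling `check_equivalence()` should return a value at the level of floating-point round-off, which is what the equivalence between the complex-valued and real-valued forms predicts. The noise term is omitted because it is stacked in the same way as the received signal.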
Appendix II Proof of Theorem 1

Here we only provide a sketch of the proof. We refer the interested readers to [2] (cf. the proof of Theorem 1
in [2]) for more details.



A. Uplink Achievable Rate-Pair 

Recall from (16) that the $n_R$ aligned eigen-modes (sub-channels) created by EDA precoding can be
written in an entry-by-entry form as

$$y_{R,i}[l] = \Psi_A(i,i)\, c_{A,i}[l] + \Psi_B(i,i)\, c_{B,i}[l] + z_{R,i}[l], \quad l = 1, \cdots, n, \ i = 1, \cdots, n_R, \tag{44}$$

where $y_{R,i}[l]$ (or $z_{R,i}[l]$) represents the $i$th entry of $Y_R[l]$ (or $Z_R[l]$).

1) Encoding: The construction of nested lattice codes for each sub-channel $i$ follows exactly from [2].
Let $\mathcal{C}_{m,i}$, $m \in \{A, B\}$, be the codebook of user $m$ for the $i$th sub-channel, and $2^{nR_{m,i}}$ be the size of $\mathcal{C}_{m,i}$.
To deliver a message in the $i$th sub-channel, user $m$ chooses a codeword $W_{m,i} \in \mathcal{C}_{m,i}$ associated with the
message. After random dithering and a modulo-lattice operation [2], a length-$n$ signal sequence
$c_{m,i}[1], \cdots, c_{m,i}[n]$
is generated, which will be transmitted in the $i$th sub-channel. The above encoding operation is performed
for all $n_R$ sub-channels.



2) Decoding (the bin-index) at the Relay: Upon receiving $\{Y_R[l]\}$, the relay computes the so-called
"bin-index" $T_i$ instead of $W_{A,i}$ and $W_{B,i}$ for each sub-channel $i$, $i = 1, \cdots, n_R$ (see the definition of
bin-index in [2]). From Theorem 3 in [2], the error probability of recovering the bin-index $T_i$ at the relay
is arbitrarily small as $n \to \infty$ if

$$R_{A,i} < \left[ \frac{1}{2} \log \left( \frac{\Psi_A(i,i)^2}{\Psi_A(i,i)^2 + \Psi_B(i,i)^2} + \Psi_A(i,i)^2 \right) \right]^+ \tag{45a}$$

$$R_{B,i} < \left[ \frac{1}{2} \log \left( \frac{\Psi_B(i,i)^2}{\Psi_A(i,i)^2 + \Psi_B(i,i)^2} + \Psi_B(i,i)^2 \right) \right]^+ \tag{45b}$$

where $i = 1, \cdots, n_R$.

Since the aligned sub-channels are orthogonal to each other, the rate-pair $(R_A, R_B)$ with which all the
bin-indices $\{T_i\}_{i=1}^{n_R}$ can be recovered correctly is given by (19a) and (19b).
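As an illustration of how the per-sub-channel conditions (45a) and (45b) add up over the aligned eigen-modes, the following sketch evaluates the nested lattice rate of [2] for each sub-channel and sums the results. It assumes unit noise variance and base-2 logarithms, and it takes the rate expression in its standard form $[\frac{1}{2}\log(P_{own}/(P_{own}+P_{other}) + P_{own}/N)]^+$; the function names `lattice_rate` and `uplink_rate_pair` are hypothetical.

```python
import math

def lattice_rate(p_own, p_other, noise=1.0):
    """[0.5 * log2(p_own / (p_own + p_other) + p_own / noise)]^+ for one sub-channel.

    p_own and p_other are received powers (squared amplitudes) of the two users.
    """
    if p_own <= 0.0:
        return 0.0  # a silent user carries no information on this sub-channel
    value = 0.5 * math.log2(p_own / (p_own + p_other) + p_own / noise)
    return max(value, 0.0)

def uplink_rate_pair(psi_a, psi_b, noise=1.0):
    """Sum the per-sub-channel rates over all aligned eigen-modes.

    psi_a, psi_b: diagonal entries Psi_A(i,i) and Psi_B(i,i), given as amplitudes.
    """
    r_a = sum(lattice_rate(a * a, b * b, noise) for a, b in zip(psi_a, psi_b))
    r_b = sum(lattice_rate(b * b, a * a, noise) for a, b in zip(psi_a, psi_b))
    return r_a, r_b
```

For example, `uplink_rate_pair([2.0, 1.0], [2.0, 1.0])` returns equal rates for the two users, since their power matrices coincide. At low power the bracket $[\cdot]^+$ clips the rate to zero, which reflects the low-SNR penalty of nested lattice codes discussed in Section VII.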
B. Downlink Achievable Rate-Pair

1) Relay's Encoding: Define a "super bin-index" as $T = [T_1, T_2, \cdots, T_{n_R}]$ and assume that $T$ is
recovered correctly by the relay. Also, assume that $R_A \ge R_B$ (see footnote 5). We generate $2^{nR_A}$ $n_R$-by-$n$ codeword
matrices with each column drawn independently from a multivariate Gaussian distribution with zero
mean and covariance $Q_R$. This forms a rate-$R_A$ codebook $\mathcal{C}_R$ (whose generation is independent of the
codebooks $\{\mathcal{C}_{m,1}, \cdots, \mathcal{C}_{m,n_R}\}$ used in the uplink phase). The codebook $\mathcal{C}_R$ is employed to map each super
bin-index $T$ into a codeword in $\mathcal{C}_R$. Denote by $X_R(T)$ the codeword in $\mathcal{C}_R$ mapped to $T$. Then, $X_R(T)$
is transmitted over the $n_R$ antennas at the relay.

2) Decoding of the Two Users: Upon receiving $Y_A$, user A decodes the super bin-index, with the estimate
denoted by $T_A$, by finding in a sub-codebook of $\mathcal{C}_R$ a codeword
that is jointly typical with $Y_A$. Here, the sub-codebook is constructed by selecting the codewords in $\mathcal{C}_R$ corresponding
to $[W_{A,1}, \cdots, W_{A,n_R}]$ (which are perfectly known to user A). Note that the cardinality of this sub-codebook is $2^{nR_B}$
[2]. From the argument of random coding and jointly typical decoding [11], we have $\Pr[T_A \ne T] \to 0$
as $n \to \infty$ if

$$R_B < R_{B,DL}^{EDA} = \frac{1}{2} \log\det\left(I + H_{R,A} Q_R H_{R,A}^T\right). \tag{46a}$$

With $T = [T_1, \cdots, T_{n_R}]$ and $[W_{A,1}, \cdots, W_{A,n_R}]$, user A can uniquely determine the messages of user B
using the method described in [2].

Similarly, user B can reliably determine the messages of user A if

$$R_A < R_{A,DL}^{EDA} = \frac{1}{2} \log\det\left(I + H_{R,B} Q_R H_{R,B}^T\right). \tag{46b}$$

Combining (19a), (19b), (46a) and (46b), we complete the proof of Theorem 1.

Appendix III Proof of Theorem 2

Proof of Theorem 2: Since the downlink rate-pair of the EDA-PNC scheme is identical to
that of the capacity UB, we only need to consider the uplink rate-pair. Specifically, we need to show that

$$R_{m,UL}^{UB} - R_{m,UL}^{EDA} \xrightarrow{P} 0, \ \text{for } n_T \to \infty, \ m \in \{A, B\}. \tag{47}$$

5 The derivation for the case of $R_A < R_B$ will be similar.
Clearly, $R_{m,UL}^{UB}$ and $R_{m,UL}^{EDA}$ are continuous functions of $H_{m,R} H_{m,R}^T$. From the property of convergence
in probability (cf. Theorem 4, p. 261 of [30]), to prove (47), it suffices to show that, if

$$\frac{1}{n_T} H_{m,R} H_{m,R}^T \to I, \ \text{as } n_T \to \infty, \tag{48}$$

then

$$R_{m,UL}^{UB} - R_{m,UL}^{EDA} \to 0, \ \text{for } n_T \to \infty. \tag{49}$$

From (7a) and (7b), we have

$$R_{m,UL}^{UB} = \frac{1}{2} \log\det\left(I + H_{m,R} Q_m H_{m,R}^T\right). \tag{50}$$

Let $P_{m,T}$ be the power allocated to user $m$, $m \in \{A, B\}$. With (48), it can be shown that as $n_T \to \infty$,
the optimal $Q_m$ takes the form of

$$Q_m = \frac{P_{m,T}}{n_R n_T} H_{m,R}^T H_{m,R}. \tag{51}$$

Thus, as $n_T \to \infty$, we obtain

\begin{align}
R_{m,UL}^{UB} &= \frac{1}{2} \log\det\left(I + \frac{P_{m,T}}{n_R n_T} H_{m,R} H_{m,R}^T H_{m,R} H_{m,R}^T\right) \tag{52a} \\
&= \frac{1}{2} n_R \log n_T + \frac{1}{2} \log\det\left(\frac{1}{n_T} I + \frac{P_{m,T}}{n_R} \left(\frac{1}{n_T} H_{m,R} H_{m,R}^T\right)^2\right) \notag \\
&= \frac{1}{2} n_R \log n_T + \frac{1}{2} \log\det\left(\frac{P_{m,T}}{n_R} I\right) + o(1) \tag{52b} \\
&= \frac{n_R}{2} \log\left(\frac{n_T P_{m,T}}{n_R}\right) + o(1), \tag{52c}
\end{align}

where (52a) follows by substituting (51) into (50), and (52b) follows from (48).
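Now, consider the EDA precoder

$$F_m = H_{m,R}^T \left(H_{m,R} H_{m,R}^T\right)^{-1} K \Psi_m, \ m \in \{A, B\}. \tag{53}$$

We choose

$$K = I \ \text{and} \ \Psi_m = \sqrt{\frac{n_T P_{m,T}}{n_R}}\, I. \tag{54}$$

The power constraint is asymptotically met, i.e.,

$$\mathrm{Tr}\left\{F_m F_m^T\right\} = \frac{n_T P_{m,T}}{n_R} \mathrm{Tr}\left\{\left(H_{m,R} H_{m,R}^T\right)^{-1}\right\} \to P_{m,T}, \ \text{for } n_T \to \infty. \tag{55}$$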
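The asymptotic statement in (55) can be illustrated with a small Monte Carlo experiment. The sketch below assumes real-valued channels with i.i.d. $\mathcal{N}(0,1)$ entries (as in Section VII), the naive choice $K = I$ with a scaled-identity power matrix, and uses the identity $\mathrm{Tr}\{F_m F_m^T\} = (n_T P_{m,T}/n_R)\,\mathrm{Tr}\{(H_{m,R} H_{m,R}^T)^{-1}\}$, which holds for that choice. All function names are hypothetical, and only the Python standard library is used.

```python
import random

def gram(H):
    """Return H H^T for a real matrix H given as a list of rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in H] for r1 in H]

def trace_of_inverse(A):
    """Trace of A^{-1}, computed by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return sum(aug[i][n + i] for i in range(n))

def precoder_power(n_t, n_r, p_total, rng):
    """Tr{F F^T} for F = H^T (H H^T)^{-1} K Psi with K = I, Psi = sqrt(n_t*p_total/n_r) I."""
    H = [[rng.gauss(0.0, 1.0) for _ in range(n_t)] for _ in range(n_r)]
    # For this choice of K and Psi, Tr{F F^T} = (n_t * p_total / n_r) * Tr{(H H^T)^{-1}}.
    return n_t * p_total / n_r * trace_of_inverse(gram(H))

def average_power(n_t, n_r=2, p_total=1.0, trials=200, seed=0):
    """Average transmit power of the naive precoder over random channel draws."""
    rng = random.Random(seed)
    return sum(precoder_power(n_t, n_r, p_total, rng) for _ in range(trials)) / trials
```

Comparing `average_power(8)` with `average_power(400)` shows the trend: with few user antennas the naive precoder spends noticeably more than $P_{m,T}$ on average, while for large $n_T$ the average power settles close to $P_{m,T}$.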
The choice of K and $\Psi_m$ in (54) is in no sense optimal. However, we will show that this suboptimal
choice is sufficient to prove (49). To see this, note that the uplink achievable rate of the EDA-PNC scheme is given
by

\begin{align}
R_{m,UL}^{EDA} &= \sum_{i=1}^{n_R} \left[ \frac{1}{2} \log\left( \frac{\Psi_m(i,i)^2}{\Psi_A(i,i)^2 + \Psi_B(i,i)^2} + \Psi_m(i,i)^2 \right) \right]^+ \tag{56} \\
&= \frac{n_R}{2} \log\left(\frac{n_T P_{m,T}}{n_R}\right) + o(1), \ \text{as } n_T \to \infty, \tag{57}
\end{align}

where (56) follows from Theorem 1 and (57) is from (54) together with the fact that $\frac{\Psi_m(i,i)^2}{\Psi_A(i,i)^2 + \Psi_B(i,i)^2} \le 1$
for $i = 1, \cdots, n_R$.

Combining (52c) and (57), we arrive at (49), which completes the proof of Theorem 2. ■



Appendix IV Proof of Lemma 3

Proof of Lemma 3: The objective function (37) is jointly concave in $\{\Sigma(1,1), \cdots, \Sigma(n_R, n_R)\}$.
The Lagrangian of problem (37) is given by



L(A,E(1,1),--- ,E(n a ,n a )) 



E 



k 



log 



' + £(z,z) 2 



1 + 7 2 
-Aj^ A g(7 ) E 



1 — a v-^ 

+ — E 



log 



7 , 2W.- ,\2 



l + 7 ; 



(58) 



i=l 



where $\lambda$ is a non-negative scalar. The partial derivative of the Lagrangian in (58) with respect to each
$\Sigma(i,i)^2$ is given by



dL 



if z G SU and i E S 



B 



<9(E(z,z) 2 ) 



1 

1+7 2 V ' 

1 1-q 

2 =^(^F 



— AA G ( 7 ) (z, z) , if z G Sa and z ^ 5 



B 



AA 



G( 7 ) l*> > 



0. 



if z ^ and i E Sb 
if i S A and z ^ 



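where $i = 1, \cdots, n_R$. The Karush-Kuhn-Tucker condition is

$$\frac{\partial L}{\partial\left(\Sigma(i,i)^2\right)} \begin{cases} = 0, & \text{if } \Sigma(i,i) > 0 \\ \le 0, & \text{if } \Sigma(i,i) = 0 \end{cases} \tag{59}$$

which yields the solution (38).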
Appendix V Proof of Lemma 4
Without loss of generality, let the SVD of $K\Sigma$ be

$$K\Sigma = U D V^T \tag{60}$$

where the diagonal elements of $\Sigma$ and $D$ are both arranged in descending order. From (60), the rank
of $D$ is the same as that of $\Sigma$ (as $K$, $U$, and $V$ are all of full rank). This implies that, if $\Sigma(i,i) = 0$ for any
index $i$, then $D(i,i) = 0$.

Let us first consider the case where $\Sigma$ has full rank. We will relax this constraint later. Using (32), we obtain

$$\mathrm{Tr}\left(G(\gamma) U D^2 U^T\right) = \mathrm{Tr}\left(U_{G(\gamma)} \Lambda_{G(\gamma)} U_{G(\gamma)}^T U D^2 U^T\right) \ge \mathrm{Tr}\left(\Lambda_{G(\gamma)} D^2\right)$$

where the diagonal entries of $\Lambda_{G(\gamma)}$ are arranged in ascending order, and the equality holds when
$U = U_{G(\gamma)}$.

Then, the optimization problem in (41a) and (41b) can be expressed as

$$\min_{D, V} \ \mathrm{Tr}\left(\Lambda_{G(\gamma)} D^2\right) \tag{61a}$$

$$\text{s.t.} \quad \left[V D^{-2} V^T\right]_{\mathrm{diag}} = \Sigma^{-2}. \tag{61b}$$

Note that the diagonal elements of $\Sigma^{-2}$ and $D^{-2}$ are both arranged in ascending order. Denote the
diagonal entries of $\Sigma^{-2}$ by $[\sigma_1, \cdots, \sigma_{n_R}]$ and those of $D^{-2}$ by $[d_1, \cdots, d_{n_R}]$. From Th. 4.3.32 of [32],
for any $[d_1, \cdots, d_{n_R}]$ majorized by $[\sigma_1, \cdots, \sigma_{n_R}]$, there always exists a unitary matrix $V$ satisfying (61b).

Therefore, the optimization problem specified in (61a) and (61b) becomes

$$\min_{d_1, \cdots, d_{n_R}} \ \sum_{i=1}^{n_R} \frac{\Lambda_{G(\gamma)}(i,i)}{d_i} \tag{62}$$

subject to the majorization constraint [32]

\begin{align}
& d_i > 0, \ \forall i \in \{1, \cdots, n_R\}, \notag \\
& d_1 \le d_2 \le \cdots \le d_{n_R}, \notag \\
& \sum_{i=1}^{t} d_i \le \sum_{i=1}^{t} \sigma_i, \ t = 1, \cdots, n_R - 1, \notag \\
& \sum_{i=1}^{n_R} d_i = \sum_{i=1}^{n_R} \sigma_i. \tag{63}
\end{align}
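We next show that, with $\Sigma$ given by (40), the solution to the optimization problem in (41a) and (41b) is given
by

$$d_i = \sigma_i, \ \forall i = 1, \cdots, n_R. \tag{64}$$

To prove (64), we need some facts, as detailed below.
Fact 2: For any $i, j \in \{1, \cdots, n_R\}$ with $i < j$, we have

$$\frac{\Lambda_{G(\gamma)}(i,i)}{\sigma_i^2} \ge \frac{\Lambda_{G(\gamma)}(j,j)}{\sigma_j^2}. \tag{65}$$

Proof of Fact 2: From (40), we have

$$\frac{1}{\sigma_i} = \frac{1}{2\lambda\Lambda_{G(\gamma)}(i,i)} - \frac{1}{1+\gamma^2} \tag{66}$$

where $\lambda > 0$. Note that $\sigma_i > 0$ for all $i = 1, \cdots, n_R$ (as $\Sigma$ is of full rank). Then,

$$\frac{\sigma_i^2}{\sigma_j^2} = \frac{\left(\frac{1}{2\lambda\Lambda_{G(\gamma)}(j,j)} - \frac{1}{1+\gamma^2}\right)^2}{\left(\frac{1}{2\lambda\Lambda_{G(\gamma)}(i,i)} - \frac{1}{1+\gamma^2}\right)^2} \overset{(1)}{\le} \frac{\left(\frac{1}{2\lambda\Lambda_{G(\gamma)}(j,j)}\right)^2}{\left(\frac{1}{2\lambda\Lambda_{G(\gamma)}(i,i)}\right)^2} \overset{(2)}{\le} \frac{\frac{1}{2\lambda\Lambda_{G(\gamma)}(j,j)}}{\frac{1}{2\lambda\Lambda_{G(\gamma)}(i,i)}} = \frac{\Lambda_{G(\gamma)}(i,i)}{\Lambda_{G(\gamma)}(j,j)} \tag{67}$$

where steps (1) and (2) both follow from $\frac{1}{2\lambda\Lambda_{G(\gamma)}(j,j)} \le \frac{1}{2\lambda\Lambda_{G(\gamma)}(i,i)}$ (as the diagonal elements of $\Lambda_{G(\gamma)}$ are
arranged in ascending order). This yields (65). ■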

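From (63), we see that, for any $[d_1, \cdots, d_{n_R}]$ in the feasible region, $d_1 \le \sigma_1$ and $d_{n_R} \ge \sigma_{n_R}$. Denote
the objective function in (62) as

$$f(d_1, \cdots, d_{n_R}) = \sum_{i=1}^{n_R} \frac{\Lambda_{G(\gamma)}(i,i)}{d_i}. \tag{68}$$

For any $d_1 \le \sigma_1$ and $d_{n_R} \ge \sigma_{n_R}$, let $\varepsilon$ be a non-negative number satisfying

$$d_1 + \varepsilon \le \sigma_1 \ \text{and} \ d_{n_R} - \varepsilon \ge \sigma_{n_R}. \tag{69}$$

Fact 3:

$$f(d_1 + \varepsilon, \cdots, d_{n_R} - \varepsilon) \le f(d_1, \cdots, d_{n_R}) \tag{70}$$

where equality holds when $\varepsilon = 0$.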
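Proof of Fact 3: We have

\begin{align}
& f(d_1 + \varepsilon, \cdots, d_{n_R} - \varepsilon) - f(d_1, \cdots, d_{n_R}) \notag \\
&= \frac{\Lambda_{G(\gamma)}(1,1)}{d_1 + \varepsilon} + \frac{\Lambda_{G(\gamma)}(n_R,n_R)}{d_{n_R} - \varepsilon} - \left(\frac{\Lambda_{G(\gamma)}(1,1)}{d_1} + \frac{\Lambda_{G(\gamma)}(n_R,n_R)}{d_{n_R}}\right) \notag \\
&= -\Lambda_{G(\gamma)}(1,1) \frac{\varepsilon}{(d_1 + \varepsilon) d_1} + \Lambda_{G(\gamma)}(n_R,n_R) \frac{\varepsilon}{(d_{n_R} - \varepsilon) d_{n_R}} \notag \\
&\overset{(a)}{\le} -\Lambda_{G(\gamma)}(1,1) \frac{\varepsilon}{(d_1 + \varepsilon)^2} + \Lambda_{G(\gamma)}(n_R,n_R) \frac{\varepsilon}{(d_{n_R} - \varepsilon)^2} \notag \\
&= \varepsilon \left(-\frac{\Lambda_{G(\gamma)}(1,1)}{(d_1 + \varepsilon)^2} + \frac{\Lambda_{G(\gamma)}(n_R,n_R)}{(d_{n_R} - \varepsilon)^2}\right) \notag \\
&\overset{(b)}{\le} \varepsilon \left(-\frac{\Lambda_{G(\gamma)}(1,1)}{\sigma_1^2} + \frac{\Lambda_{G(\gamma)}(n_R,n_R)}{\sigma_{n_R}^2}\right) \notag \\
&\overset{(c)}{\le} 0 \notag
\end{align}

where step (a) is self-evident, step (b) follows from (69) and step (c) follows from (65) in Fact 2. The
equalities in steps (a)-(c) hold when $\varepsilon = 0$, which completes the proof. ■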
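Fact 3 can be illustrated on a small numerical instance. The sketch below evaluates the objective $\sum_i \Lambda_{G(\gamma)}(i,i)/d_i$ and checks that moving an amount $\varepsilon$ allowed by (69) from the last entry of $d$ to the first one never increases it. The function names and the sample values are hypothetical; the values are chosen so that the ordering condition (65) of Fact 2 holds.

```python
def objective(lam, d):
    """f(d_1, ..., d_n) = sum_i lam[i] / d[i], the objective in (62)."""
    return sum(l / x for l, x in zip(lam, d))

def shift(d, eps):
    """Move eps from the last entry of d to the first one, keeping the sum fixed."""
    return [d[0] + eps] + list(d[1:-1]) + [d[-1] - eps]

def fact3_holds(lam, sigma, d, steps=50):
    """Check f(shift(d, eps)) <= f(d) on a grid of eps values allowed by (69)."""
    eps_max = min(sigma[0] - d[0], d[-1] - sigma[-1])
    base = objective(lam, d)
    return all(
        objective(lam, shift(d, eps_max * k / steps)) <= base + 1e-12
        for k in range(steps + 1)
    )

# Sample instance: lam ascending, sigma ascending, lam[0]/sigma[0]**2 >= lam[1]/sigma[1]**2,
# and d feasible for (63): d[0] <= sigma[0], sum(d) == sum(sigma).
lam = [1.0, 2.0]
sigma = [1.0, 2.0]
d = [0.5, 2.5]
```

With these values, `fact3_holds(lam, sigma, d)` evaluates to `True`, and the largest admissible shift, $\varepsilon = 0.5$, maps $d$ onto $\sigma$, which is the endpoint the subsequent argument relies on.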
Fact 3 implies that the objective function $f(d_1 + \varepsilon, \cdots, d_{n_R} - \varepsilon)$, with $\varepsilon$ constrained by (69), is
minimized when $\varepsilon = \sigma_1 - d_1$ or $\varepsilon = d_{n_R} - \sigma_{n_R}$. Therefore, the optimum of the problem in (62) is
achieved at either $d_1 = \sigma_1$ or $d_{n_R} = \sigma_{n_R}$. Without loss of generality, we assume that $d_1 = \sigma_1$. Then,
the dimension of the problem in (62) reduces from $n_R$ to $n_R - 1$. Applying the same reasoning to this
$(n_R - 1)$-dimensional problem, we can further show that $d_2 = \sigma_2$. Continuing this process, we eventually
have (64), or equivalently,

$$D = \Sigma.$$

Therefore, from (60) and the uniqueness of SVD, we obtain $V = I$ and $K = U_{G(\gamma)}$, which is the K given in (30).
Next, consider the case where $\Sigma$ does not have full rank. Define

$$\tilde{\Sigma} = \Sigma + \delta \sqrt{\Delta}$$

where $\delta$ is an arbitrary positive number, and $\Delta$ is a diagonal matrix with non-negative diagonal elements.
We can properly choose such a $\Delta$ that: (a) $\tilde{\Sigma}$ is of full rank; (b) for a sufficiently small $\delta$, Fact 2 always
holds for $\tilde{\Sigma}$ (and so does Fact 3). To this end, we choose

$$\Delta(i,i) = 0, \ \text{if } \Sigma(i,i) \ne 0 \tag{71}$$

and

$$\frac{\Lambda_{G(\gamma)}(i,i)}{\Delta^{-2}(i,i)} \ge \frac{\Lambda_{G(\gamma)}(j,j)}{\Delta^{-2}(j,j)}, \ \text{if } \Sigma(i,i) = \Sigma(j,j) = 0, \ i < j. \tag{72}$$

We verify Fact 2 for the above choice of $\Delta$. Noting that the diagonal entries of $\Sigma$ are arranged in
descending order, we only need to consider three cases: (i) $\Sigma(i,i) > 0$, $\Sigma(j,j) > 0$, $i < j$; (ii) $\Sigma(i,i)
> 0$, $\Sigma(j,j) = 0$, $i < j$; (iii) $\Sigma(i,i) = \Sigma(j,j) = 0$, $i < j$. From our previous proof, Fact 2 holds for
case (i). For case (ii), Fact 2 can be guaranteed by letting $\delta$ be sufficiently small. For case (iii), Fact 2 is
guaranteed from (72). Thus, Fact 2 is guaranteed for a sufficiently small $\delta$ and the chosen $\Delta$.
Now, consider the following optimization problem:

$$\min_{K} \ \mathrm{Tr}\left(G(\gamma) K \tilde{\Sigma}^2 K^T\right) \tag{73a}$$

$$\text{s.t.} \quad \left[K^{-1} \left(K^{-1}\right)^T\right]_{\mathrm{diag}} = I. \tag{73b}$$

Noting that $\tilde{\Sigma}$ is of full rank and Facts 2 and 3 hold for $\tilde{\Sigma}$, we see that the optimal solution to the above
problem is the K given in (30). Now, let $\delta \to 0$. From the continuity of the problem in (73), the optimal K is still
given by (30). This completes the proof of Lemma 4.
Fig. 1. Configuration of a MIMO TWRC, showing the uplink phase and the downlink phase.

Fig. 2. Geometrical illustration of (a) naive EDA precoding and (b) EDA precoding (with $KK^T = I$), for a two-dimensional case. Here,
$H_{m,R} = [h_{m,1}, h_{m,2}]$, $m \in \{A, B\}$. The dashed arrow and solid arrow denote users A and B, respectively. For A, since the correlation of
its channel vectors is large, a significant power loss is suffered in the naive EDA precoding. The proposed EDA precoding can effectively
avoid this loss by introducing a rotation.

Fig. 3. Achievable sum-rate of the proposed EDA-PNC scheme for a MIMO TWRC with $n_T = n_R = 2$.

Fig. 4. Achievable sum-rate of the proposed EDA-PNC scheme for MIMO TWRCs with $n_R = 2$, $n_T = 2, 3, 4$.

Fig. 5. Achievable sum-rate of the proposed EDA-PNC scheme for a MIMO TWRC with $n_T = n_R = 4$.

Fig. 6. Achievable sum-rate of the proposed EDA-PNC scheme for MIMO TWRCs with $n_R = 4$, $n_T = 4, 6, 8$.

Fig. 7. Achievable rate-region of the proposed EDA-PNC scheme for a MIMO TWRC with $n_T = n_R = 4$, where SNR = 0, 10, 15, 25
dB.

Fig. 8. Achievable rate-region of the proposed EDA-PNC scheme for a MIMO TWRC with $n_T = 8$, $n_R = 4$, where SNR = 0, 10, 15, 25
dB.



